MSNovo: a dynamic programming algorithm for de novo peptide sequencing via tandem mass spectrometry.

Authors

  • Lijuan Mo
  • Debojyoti Dutta
  • Yunhu Wan
  • Ting Chen
Abstract

Tandem mass spectrometry (MS/MS) has become the experimental method of choice for high-throughput proteomics-based biological discovery. The two primary ways of analyzing MS/MS data are database search and de novo sequencing. In this paper, we present a new approach to peptide de novo sequencing, called MSNovo, which has the following advanced features. (1) It works on data generated from both LCQ and LTQ mass spectrometers and interprets singly, doubly, and triply charged ions. (2) It integrates a new probabilistic scoring function with a mass array-based dynamic programming algorithm. The simplicity of the scoring function, with only 6-10 parameters to be trained, avoids the problem of overfitting and allows MSNovo to be adopted for other machines and data sets easily. The mass array data structure explicitly encodes all possible peptides and allows the dynamic programming algorithm to find the best peptide. (3) Compared to existing programs, MSNovo predicts peptides as well as sequence tags with a higher accuracy, which is important for those applications that search protein databases using the de novo sequencing results. More specifically, we show that MSNovo outperforms other programs on various ESI ion trap data. We also show that for high-resolution data the performance of MSNovo improves significantly. Supporting Information, executable files and data sets can be found at http://msms.usc.edu/supplementary/msnovo.
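The idea of a mass array combined with dynamic programming can be illustrated with a minimal sketch. This is not MSNovo's implementation: the nominal integer masses, the additive intensity score (standing in for the paper's probabilistic scoring function), and the names `RESIDUE_MASS`, `fragment_ions`, and `best_peptide` are all assumptions made for illustration. Each array index is a candidate prefix mass, and the best score at that index is built from the best score one residue earlier.

```python
# Nominal (integer) residue masses for an illustrative subset of amino acids.
RESIDUE_MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101,
                "L": 113, "N": 114, "D": 115, "K": 128, "E": 129}


def fragment_ions(peptide):
    """Nominal singly charged b- and y-ion masses of a peptide."""
    total = sum(RESIDUE_MASS[r] for r in peptide)
    prefix = 0
    ions = []
    for residue in peptide[:-1]:
        prefix += RESIDUE_MASS[residue]
        ions.append(prefix + 1)            # b ion: prefix + proton
        ions.append(total - prefix + 19)   # y ion: suffix + water + proton
    return ions


def best_peptide(peaks, total):
    """Find the highest-scoring residue sequence whose masses sum to `total`.

    peaks: dict mapping nominal m/z to intensity (assumed singly charged).
    Returns (peptide, score), or None if `total` cannot be reached.
    """
    def node_score(m):
        # Toy score: intensity found at the b and y positions implied by
        # prefix mass m. A real scoring function would be probabilistic.
        return peaks.get(m + 1, 0.0) + peaks.get(total - m + 19, 0.0)

    NEG = float("-inf")
    best = [NEG] * (total + 1)   # mass array: best[m] = best score at prefix m
    back = [None] * (total + 1)  # last residue on the best path to m
    best[0] = 0.0
    for m in range(1, total + 1):
        choices = [(best[m - w], aa) for aa, w in RESIDUE_MASS.items()
                   if m - w >= 0 and best[m - w] > NEG]
        if choices:
            prev_score, aa = max(choices)
            best[m] = prev_score + (node_score(m) if m < total else 0.0)
            back[m] = aa
    if best[total] == NEG:
        return None
    residues = []
    m = total
    while m > 0:                 # walk the back-pointers to recover the peptide
        residues.append(back[m])
        m -= RESIDUE_MASS[back[m]]
    return "".join(reversed(residues)), best[total]


if __name__ == "__main__":
    peaks = {mz: 1.0 for mz in fragment_ions("GASP")}
    print(best_peptide(peaks, sum(RESIDUE_MASS[r] for r in "GASP")))
```

The sketch scores every prefix mass independently, so a single peak may be counted once as a b ion and once as a y ion; handling that double use, multiple charge states, and real mass tolerances is left out here.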

Similar resources

AuDeNS: A Tool for Automatic De Novo Peptide Sequencing

We have developed and implemented a framework for de novo sequencing of peptides using tandem mass spectrometry data. It first cleans the input spectrum with a number of data cleaning algorithms (“grass mowers”), followed by a sequencing algorithm that is a modification of a dynamic programming algorithm introduced in [CKT00]. In first experiments, our prototype performs well (but not better) i...

De Novo Peptide Identification Via Mixed-Integer Linear Optimization And Tandem Mass Spectrometry

A novel methodology for the de novo identification of peptides via mixed-integer linear optimization (MILP) and tandem mass spectrometry is presented. The overall mathematical model is presented and the key concepts of the proposed approach are described. A pre-processing algorithm is utilized to identify important m/z values in the tandem mass spectrum. Missing peaks, due to residue-dependent f...

A Two-way Parallel Searching for Peptide Identification via Tandem Mass Spectrometry

De novo peptide sequencing that determines the amino acid sequence of a peptide via tandem mass spectrometry (MS/MS) has been increasingly used nowadays in proteomics for protein identification. Current de novo methods generally employ a graph theory, which usually produces a large number of candidate sequences and causes heavy computational cost while trying to determine a sequence with less a...

PepNovo: de novo peptide sequencing via probabilistic network modeling.

We present a novel scoring method for de novo interpretation of peptides from tandem mass spectrometry data. Our scoring method uses a probabilistic network whose structure reflects the chemical and physical rules that govern the peptide fragmentation. We use a likelihood ratio hypothesis test to determine whether the peaks observed in the mass spectrum are more likely to have been produced und...

Journal:
  • Analytical Chemistry

Volume 79, Issue 13

Publication date: 2007